Abstract

Although multi-institution data are more available than ever, researchers still face daunting
hurdles when collecting data from multiple institutions. These hurdles are particularly high for
stand-alone projects. This paper argues that despite the difficulties, results from multi-institution
research benefit both participating colleges or universities and higher education as a whole.

In support of this argument, a detailed case study of the data collection for one multi- 
institutional research project is reported. This case study describes an investigation into the 
departmental factors associated with disproportionate loss of women from undergraduate 
computing majors. The case study identifies data sources, illustrates challenges to collecting data
from numerous institutions, and notes potential rewards for both higher education and
participating institutions.

Data for Multi-Institution Research

Multi-institution research benefits both participating institutions and higher education as 
a whole. When colleges and universities join together to investigate common concerns, 
institutions gain information that enhances program evaluation, planning, and decision-making, 
particularly regarding their position relative to other institutions (Trainer, 1996). The money 
spent on participating in multi-institution research saves participating institutions from the high
cost of uninformed decisions (Hackett, 1996). Higher education as a whole benefits from
multi-institution research because it gains information that both satisfies demands for accountability
and increases understanding of important issues. Examples of the issues addressed by
multi-institution research include racial equity (Pavel and Reiser, 1991), institutional quality
(Vinsonhaler and Vinsonhaler, 1991), and financial concerns (Creswell, Chronister, and Brown,
1991). These examples illustrate how the broad interests of society and higher education advance
when generalizable findings and comparative measures are produced through research involving
many institutions. 

Technological progress has improved the processes for collecting original data from large
populations (for example, see Dillman, 2001). Technology has also facilitated implementation
of student tracking systems (Borden, 1995) and the resulting improvements in the collection of
secondary data from multiple institutions. Official data repositories such as state higher
education councils, WebCASPAR, data-sharing consortiums, and offices of institutional research
are outstanding sources of data from multiple institutions. In particular, these large student unit
databases are often well established, contain a wealth of information, and cover broad
geographical areas. Their worth and increasing prevalence in the United States are documented
(Russell and Chisholm, 1995), and can be observed by visiting websites such as AIR’s “Internet
Resources for Institutional Research” at http://airweb.org/links/intemetreports.html. By 1999,
forty-two states had postsecondary education data systems, most of which were unit record
databases (Russell, 1999). Their usability improves over time, as is easily demonstrated by
comparing WebCASPAR’s functionality today with that described by Firaberg in 1991.
However, when a project’s needs do not fit with statistics that have already been produced, the
persisting limitations of these data sources become apparent.

Regardless of the benefits and recent advances in methodology, there are still hurdles that 
must be overcome when gathering data for multi-institution research. Difficulties occur when 
data are unavailable from official sources, a situation that occurs for a variety of reasons.
National and statewide databases may not contain necessary information because the data were
not collected. Even if the desired information does exist, it may not be available because the
storage format can make data difficult to extract. Likewise, proprietors of institutional databases
are not always able or willing to accommodate external requests. Sapp (1996) suggested possible 
reasons for reluctance in relation to data-sharing consortia: the cost associated with personnel 
and time; concerns over whether institutional comparisons are valid or potentially unfavorable; 
and concerns about sensitive data. 

The following case study illustrates the current state of multi-institutional research. It
specifies data sources, methods, hurdles, and the potential rewards to higher education and
participating institutions. This detailed example is presented to encourage continued
development of data resources and increased access to data, and to promote broad support for 
valuable multi-institution research. 

Case study of a multi-institution research project 

“Departmental Factors in Gendered Attrition from Undergraduate IT Majors” is a study of
computer science program retention by gender that began in the fall of 2000. This three-year
project, funded by the National Science Foundation, was built upon a statewide study in
Virginia. Expanding on the pilot, the nationwide study examines a timely and important issue 
that is relevant to both our nation’s economic well-being and gender equity. The study is an 
attempt to determine why many, but not all, undergraduate computing programs lose women to 
other majors at higher rates than they lose men. In a similar fashion to Young and Reglinger’s 
(2000) focus on student flows through academic majors, this study tracks the outflow of students 
from computing majors. The results will identify which departmental factors influence the size of
the gender gap in attrition from this discipline. 

The focus of the study is on department-level influences, as opposed to gender or 
socialization influences on students. The unit of analysis is the department, so we take a novel 
approach to calculating our dependent variable, gendered attrition rates. Gendered attrition rates
are the difference between a department’s male and female average annual rates of migration out
of the major during the six study years. By comparing attrition rates for men and women within the
same department, the gendered attrition rate measures the difference in outcomes for two groups in
the same environment. Thus, this measure is a useful tool for investigating gendered outcomes
across institutions because it controls for departments’ overall attrition rates. 
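
As a rough illustration of this measure, the sketch below computes a gendered attrition rate from annual migration rates. The function name, the sign convention (female minus male), and the six pairs of rates are assumptions made for the example; they are not the project's data or code.

```python
from statistics import mean


def gendered_attrition_rate(female_rates, male_rates):
    """Gap between average annual female and male migration rates.

    Assumption: the gap is taken as female minus male, so a positive value
    means women left the major at a higher average annual rate than men.
    """
    if not female_rates or len(female_rates) != len(male_rates):
        raise ValueError("need one female and one male rate per study year")
    return mean(female_rates) - mean(male_rates)


# Hypothetical annual migration rates for one department over six study years.
female = [0.20, 0.18, 0.22, 0.19, 0.21, 0.20]
male = [0.15, 0.14, 0.16, 0.15, 0.15, 0.15]

gap = gendered_attrition_rate(female, male)
print(f"gendered attrition rate: {gap:.3f}")
```

Because both averages come from the same department, a department with high overall attrition and one with low overall attrition can still show the same gap, which is the sense in which the measure controls for overall attrition.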

Data from surveys and interviews are aggregated by department to measure most of the 
independent variables. These variables include numerous aspects of departmental resources and 
faculty practices and attitudes. Aggregate data from official records of enrollment and graduation 
measure the dependent variable and the gender composition independent variable. 

Two hundred ten study departments were selected because they rank among the most 
prestigious computer science programs and/or they recently graduated the largest numbers of 
computer science baccalaureates in the contiguous United States. We identified these
departments based on 1996 and 1997 degrees-granted data from the National Science
Foundation’s WebCASPAR website (http://caspar.nsf.gov). As a group, the study departments
produce approximately three-fifths of the computer science bachelor’s degrees in the nation.

Efforts to obtain the data for the dependent variable have been in progress since the 
project began. Acquiring these data has been the most challenging aspect of the project. To 
overcome low initial response, we successively contacted several potential data sources for each 
study institution. These sources included government data repositories, sponsors within the study 
institutions, and institutional researchers. None of these approaches has yet been highly 
successful, although each has yielded some results. Efforts will continue until analyses are well 
underway, so that the maximum number of institutions can be included. 

The next two sections describe how the interview and survey data were obtained from 
individual respondents at various institutions. These tasks were undertaken during the spring
semesters of 2001 and 2002. The third section describes the various approaches to obtaining data
from official sources. 

Interviews at 18 Institutions 

The goal of the interview phase of the project was to collect qualitative data from multiple 
institutions. These data would ground the study in the realities of undergraduate computer 
science education and describe participants’ experiences in their own words. The interview phase 
was highly successful, although there were significant costs and logistic considerations. The 
results were valuable both as research products and as enhancements to the quality of the overall 
project. 

Interviewer teams traveled to four urban locations across the United States and three non- 
urban locations in Virginia. The urban sites were selected for variety in geographic location and 
for availability of several study institutions in each locale. The selected urban locations were 
New York City, Chicago, San Diego/Los Angeles, and Atlanta. At each of these locations, 
specific computer science departments were selected for variety in gender balance and 
institutional type. The non-urban sites were a convenience sample. 

Once sites were selected, department chairpersons were contacted by letter and by 
telephone to request permission for a visit. Cooperation was high - 18 of the 22 departments 
(82%) contacted agreed to be visited. However, the number of people who participated at each 
site did not always meet expectations. We maximized faculty and student participation by
scheduling multiple researchers in a department at the same time and by spreading visits over an
entire week. Student focus group participants were offered pizza and soda, and in a few cases,
departments provided incentives such as t-shirts.

Four interviewers (the Principal Investigator and three trained, experienced graduate
students) traveled to each site. During each trip, interviewers visited three or four institutions to
interview all willing participants. In almost every case, the undergraduates were interviewed in
sex-segregated focus groups. Faculty and administrator interviews were one-on-one and
semi-structured. As a result of this process, the interviewers spoke with 325 members of
undergraduate computer science programs: 23 faculty who were chairpersons or administrators,
120 teaching faculty, and 182 undergraduates in 31 groups. Participants were all recruited and
scheduled by the chairperson or the chairperson’s designated representative. 

Interviewers asked chairpersons about program and faculty characteristics, teaching
emphasis, resources, and their observations regarding the relationship between students’ sex and
retention in their computing program. Faculty were prompted to speak about their teaching and
mentoring, the environment in the department, and their observations of the relationship between
students’ sex and retention. Students were asked to discuss what drew them to computing and
their particular program, positive and negative experiences, and how they cope with negative 
experiences. 

All but two participants agreed to be audio-taped; untaped interviews were recorded with
interviewer notes. Audio tapes were copied and transcribed, and the transcriptions were reviewed 
for accuracy. Analysis of the transcripts and notes is continuing with particular attention to 
differences between departments. The analyses have already produced some interesting
preliminary results on issues such as variation in pedagogical practices, the gendered nature of
student satisfaction with teaching, attitudes and behaviors with respect to academic dishonesty,
and the appeal of computer science.

The qualitative component of the project contributed significantly to the overall project.
Visiting a variety of institutions made it possible to compare departments, leading to
observations of some surprising commonalities. For example, students frequently raised the issue
of cheating. The intensity of their feelings and the degree to which they found cheating to be
discouraging were issues not noted in the literature on student retention in computing majors.
The relationship between academic dishonesty and student outflow from computing majors bears
further examination. Another contribution of the broadly based qualitative data was to inform the
construction of the survey that was subsequently sent to all study institutions. Both questions and
response categories were improved as a result of findings from the interviews and focus groups.

The challenges involved with this phase of the project included the expense of travel and
interviewer pay, coordination of travel and site arrangements, and quality control of the 
interviews and focus groups, including transcription and transcript coding. Respectively, these 
challenges were overcome with grant funds, patient and persistent communication and 
negotiation, and close review of all work. 

The rewards for participating institutions were confidential summary reports describing 
the various characteristics and practices of each interview site. These summary reports contained 
descriptive information on student reasons for choosing a computing major and their particular 
program; positive and negative experiences students had in the program; how students cope;
faculty pedagogical and mentoring practices; support programs for students; faculty estimations
of student quality and opinions of student qualities necessary for success in the program; faculty
evaluations of collegiality, the gender climate in the department, and institutional support the
program received; and faculty reasons for considering leaving their position. Comparative
reports will be posted on the project website so that departments can evaluate their own
interview results relative to other anonymous programs. The benefit to higher education was
detailed descriptive knowledge about a variety of conditions and practices in academic
departments. This information deepens understanding of student and faculty concerns and how to
cope with these concerns. 

Surveys at 210 Institutions

The goal of the survey phase of the project was to measure characteristics and practices of all 
study departments. We collected quantitative data from chairpersons and faculty via web, mail,
and telephone surveys. The web survey was the first response mode offered; mail was second; 
and telephone was third. Overall, collection of survey data was highly successful. 

To create a sampling frame, contact information for faculty and chairpersons was
obtained from the websites of study departments. The information was verified by administrative
assistants via telephone conversations or faxed responses. This collection and verification was a
very labor-intensive process. 

The chairperson of every study department was selected to participate in the survey. 
Chairperson survey questions addressed features of their undergraduate computing programs,
including emphasis on teaching, resources, concerns, and personal demographic information. In
addition to chairpersons, a stratified random sample of up to 25 faculty was selected for each
study department with women over-sampled. Faculty questions addressed professional activities 
and departmental life, focusing on pedagogy, mentoring, evaluations of students and the 
department, and personal demographic information. 

Each selected department head and faculty member was contacted at least ten times 
before non-response was accepted as a refusal to participate. This effort resulted in an overall
response rate of 67% from an eligible sample of 2,526 faculty and 71% for 209 chairpersons. Of
the 1,683 faculty respondents, 1,514 (90%) responded online, 137 (8%) responded by mail, and
32 (2%) responded by telephone. Of the 152 responding department heads, 138 (91%) responded
online, 12 (8%) responded by mail, and 2 (1%) responded by telephone.
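
The response-mode percentages above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the helper name `mode_shares` is hypothetical, and the only inputs are the counts stated in the text.

```python
def mode_shares(counts):
    """Return each response mode's share of all responses, as a whole percent."""
    total = sum(counts.values())
    return {mode: round(100 * n / total) for mode, n in counts.items()}


# Counts reported in the text.
faculty_modes = {"web": 1514, "mail": 137, "telephone": 32}
chair_modes = {"web": 138, "mail": 12, "telephone": 2}

faculty_share = mode_shares(faculty_modes)
chair_share = mode_shares(chair_modes)

# Faculty response rate: respondents divided by the eligible sample of 2,526.
faculty_rate = round(100 * sum(faculty_modes.values()) / 2526)

print(faculty_share, chair_share, faculty_rate)
```

Laid out this way, the dominance of the web mode is easy to see for both groups, which is consistent with the decision to offer the web survey first.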

Although the survey data are still being prepared for analysis, they have already 
demonstrated the worth of this multi-institution research. Interesting and thought-provoking 
results are emerging. For example, the preliminary results from responses via the web show (in 
Figure 1) that 41% of the 1,502 computer science faculty who responded to this question had
seriously considered leaving their position in recent years. The most common motivation for
their deliberation was dissatisfaction with the institutional support their department received:
42% of those who considered leaving rated this dissatisfaction as a strong or very strong
motivation. Other powerful motivations were dissatisfaction with departmental leadership (37%),
the desire for career change or advancement (35%), money (29%), and the desire for better
students (27%). Except for retirement, teaching was the weakest stimulus for thoughts of leaving:
only 6% of those who considered leaving were strongly motivated by the desire to teach more;
16% were strongly motivated by the desire to teach less. According to chairpersons, the actual
turnover in study departments averaged about 3% annually, and 17% overall during the study 
period. However, given the opportunities available to computer scientists and the shortage of 
faculty in this discipline, institutions might regard this potential for turnover as a serious cause 
for concern. 

The challenges involved with the survey phase of this project were the cost of running a
large survey, persuading busy faculty and chairpersons to respond, and coordinating efforts with 
a survey research center hired to implement the mail and telephone components of the project. 
These challenges were overcome by leading with the web version of the survey, which required 
no printing, mailing, data entry, or long-distance calling costs; use of the token incentive; polite,
but persistent contacts; and close interactions with responsible staff including periodic meetings 
and frequent progress reports. 

The benefits for participating departments were summary data measuring common
discipline concerns and teaching and mentoring practices in the average CS department. The
benefit to higher education was generalized findings that can inform discussions on subjects such
as pedagogical practices, faculty retention, and gender issues in an economically important
discipline.

Official Data

The goal of obtaining official enrollment and disposition data was to measure the composition of
the student body in each study program, and to calculate the rates at which men and women 
migrated out of the major during the study period. The official data made it possible to 
statistically relate measures of departmental conditions and practices with departmental gendered 
attrition outcomes. Without these official data, the project would produce many interesting and 
useful findings, but it would not identify which departmental factors significantly influence the
size of the gender gap in attrition from computing majors. Unfortunately, obtaining the official
data posed the greatest challenge of any aspect of the project. 

In order to calculate the rates at which male and female students leave computing majors, 
we requested that study institutions provide the following data aggregated from student unit
records. 

For each year from 1994 through the most recent year available, provide a fall
headcount of computer science majors (CIP code 1100) broken down by sex, by
level,* by CIP sub code, and by full or part-time status. Show average Math SAT
score and average major GPA for each group. Track each group to the fall of
subsequent years and identify their dispositions. These dispositions are: number
still enrolled with the same major, number still enrolled but with a different
major, number graduated in the same major, number graduated in a different
major, and number no longer enrolled.

* Including all levels of students each year overcomes problems with small numbers that can be
encountered when tracking out a single cohort of new students. It is comparable to demographers’
methods for calculating population death rates.

These data can be used to calculate measures of programs’ gender composition at different points
in time, male and female annual rates of migration from the major, and annual and study period 
average gendered attrition rates (the gap in male and female migration rates). 
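
To make the calculation concrete, the following sketch derives gender composition and annual migration rates from disposition counts of the kind requested above. The class and field names and all of the counts are hypothetical. The sketch also assumes that "migration" means moving to a different major (whether still enrolled or graduated) and excludes students who are no longer enrolled; the paper does not spell out that rule.

```python
from dataclasses import dataclass


@dataclass
class Dispositions:
    """One group's fall headcount and its dispositions the following fall."""
    headcount: int
    same_major_enrolled: int
    diff_major_enrolled: int
    same_major_graduated: int
    diff_major_graduated: int
    not_enrolled: int

    def __post_init__(self):
        accounted = (self.same_major_enrolled + self.diff_major_enrolled
                     + self.same_major_graduated + self.diff_major_graduated
                     + self.not_enrolled)
        if accounted != self.headcount:
            raise ValueError("dispositions must sum to the headcount")

    def migration_rate(self) -> float:
        # Assumption: migrants are students who ended up in a different major.
        moved = self.diff_major_enrolled + self.diff_major_graduated
        return moved / self.headcount


def proportion_female(women: Dispositions, men: Dispositions) -> float:
    """Gender composition of the program at the starting fall headcount."""
    return women.headcount / (women.headcount + men.headcount)


# Hypothetical counts for one department and one year.
women = Dispositions(50, 30, 10, 6, 2, 2)
men = Dispositions(200, 130, 20, 30, 4, 16)

annual_gap = women.migration_rate() - men.migration_rate()
print(proportion_female(women, men), women.migration_rate(),
      men.migration_rate(), annual_gap)
```

Averaging the annual male and female rates over the study years, and then taking their difference, would give the study-period gendered attrition rate described in the text.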

Government sources 

National databases that contain institutional data on enrollment and graduation numbers were the 
first choice as a source for the information needed to calculate gendered attrition rates. We began 
with WebCASPAR, the National Science Foundation’s online source of statistics on academic
science and engineering in the United States. This site provides access to data from a variety of 
sources including the Integrated Postsecondary Education Data System (IPEDS). 

However, after careful investigation, it became clear that national databases do not 
contain the data needed for quantifying undergraduates’ migration out of computing programs.
The IPEDS data break down students by major only at graduation; enrollments are not grouped
by major, making it impossible to track or approximate annual migrations out of a major.

The second choice of a data source that maximizes results while minimizing effort was 
statewide data repositories. This approach had been very successful in Virginia where the State
Council of Higher Education for Virginia (SCHEV) supplied the data for all 23 programs in the 
pilot study for this project. SCHEV graciously agreed to extend their support by supplying 
updated statistics for the national study.

To enlist support and gain legitimacy, SHEEO, the nationwide State Higher Education
Executive Officers association was asked to help persuade state higher education councils to 
provide the requested data. This approach yielded commitments from five states, three of which 
have complied thus far. Seven states responded with regrets that they did not have the data 
needed for this study. Two states offered to make their data available for a fee. In one case, the 
cost was estimated at a total of $1000 for the data from any number of colleges or universities in 
the state. The representative of one systemwide data repository declined because the request 
“results in a fair amount of data processing for us, which I am not inclined to take on.” To date, 
approaching state councils of higher education has yielded a total of 30 data sets out of the 209
requested. An additional three states have promised to supply 19 data sets. With continued 
attempts to reach non-responding councils known to have student unit record data, the number of 
data sets may still increase. 

Sponsored requests 

The limited results yielded by appeals to statewide data repositories prompted another approach.
We directly contacted the computer science chairpersons in the remaining institutions. Our 
rationale was that, as interested parties and members of the study institution, their sponsorship 
would carry more weight than an external request. 

Email sent to department heads in the spring of 2001 described the project, its
endorsements, and the anticipated outcomes, and asked them to obtain the data. Based on prior 
experience with institutional researchers who denied requests because they did not fit with
previously produced reports, we suggested that an effective data source might be the information
technology staff in the Registrar’s office. In some cases, this suggested approach was successful. 
There were also departments that had administrators in possession of the necessary data, and in 
many other cases, departments found their Institutional Research Office to be the most 
appropriate source. 

In conjunction with these sponsored requests, we created a website for the project. The 
website described the project and gave details about the data needed. It provided an example of
the suggested file layout and defined terms for the project. Each email to computer science
department heads included the address for this webpage as an easy reference
(http://curry.edschool.virginia.edu/ITattrit/).

Perhaps because chairpersons were also asked to complete a questionnaire in the survey
portion of this project, the sponsored request approach to gathering data for the dependent
variable yielded only twelve commitments to provide data. Of these commitments, five data sets
have arrived to date. In one case, the data have been delayed by requirements that a university’s
institutional review board consider the project. Despite the fact that the project’s home
institutional review board had already considered and approved the protocols, and that only
aggregate data were being requested, a full review was conducted before the data could be sent.

Data-sharing consortia and Institutional Researchers 

Twenty-three study institutions belong to a data-sharing consortium that focuses on retention of 
science, mathematics, engineering, and technology majors. The data member institutions supply
the consortium are similar, but not identical to, the data needed for this gendered attrition study.
Thus, although data from the consortium would not meet the needs of this project, consortium 
members should have the capability to provide the data needed for this study. 

Beginning in summer 2001 and continuing through winter, we directly contacted 
institutional researchers at non-responding colleges and universities. The email message
contained a brief description of the project’s goals and endorsements, an outline of the requested
data, and a link to the project website providing a more detailed description of the necessary data
and background on the project. Members of the data-sharing consortium were alerted to the 
similarity of the requested data to the data they supplied the consortium. 

The results from direct contact with institutional researchers were 27 data sets received 
and 11 promised but not yet sent. Of the responding institutional researchers, all but four were
members of the data-sharing consortium. As was the case with state higher education councils
and with chairpersons, most institutional researchers who were contacted did not reply.
Follow-up contacts are continuing with non-respondents.

Among those institutional researchers who replied but declined to participate, by far the 
most common reason for not providing the data was a lack of sufficient resources. For example, 
one institutional researcher wrote, 

[W]e're drowning in deadlines and ever-expanding external demands, along with
a massive system-wide changeover to [different software]— the effects of which I 
can describe in one word: Aaaarrggghhh! Our tiny staff has not been able to
respond to requests like yours, which are perfectly reasonable and do represent
areas of research that we've been trying to get to for our OWN campus's use! 

Despite interest in the topic and desire to assist with this project, this office simply did not have 
the staff time. 

The second most common reason given for not providing the data was that it had not
been collected. For example, the following institution was only beginning to keep records of 
dropouts. 

I really doubt that we have anything of value. We are just starting to identify our 
dropouts generally and will probably be a year at that task . . . 

Under these conditions, it would be a particularly time consuming task to locate annual lists of 
undergraduate computer science majors and graduating seniors, and match them from year to
year. Simply not having the data in a useful format makes it hard to recognize it as suitable for 
this study, and more resource intensive to use if it is recognized. Despite the offer to have project 
staff work from annual lists of students by ID code, sex, level, etcetera, and convert them into the 
necessary format, no institution chose that option. 

Results from collecting official data 

After two years of continual attempts to gather official data for calculating gendered attrition 
rates, we achieved a compliance rate of 30%. The most productive source was the state higher
education councils, which supplied 48% of the data sets received to date. Last call notices may 
result in some additional data before analyses are underway.

Regardless of the importance of these official data, gathering them has been the least
successful aspect of this project. The disappointing results with this key element for analyzing 
departmental effects on the disproportionate loss of female students have delayed the project. 
Deadlines have been extended so that overworked institutional researchers can comply with our 
request, and so that new approaches for increasing participation can be employed. For example, 
several study institutions are part of a citywide university system. The system administration 
may serve as a source of data for all the non-responding institutions located in that city. 
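
The compliance figures reported in this section can be tallied from the counts given for each approach. The sketch below assumes that the three received counts (30 from state councils, 5 from sponsored requests, 27 from institutional researchers) do not overlap and that the denominator is the 209 data sets requested; the variable names are illustrative.

```python
# Data sets received to date, by source, as reported in the text.
received = {
    "state higher education councils": 30,
    "sponsored requests via chairpersons": 5,
    "institutional researchers": 27,
}
requested = 209  # assumption: one data set requested per study department

total_received = sum(received.values())
compliance_rate = total_received / requested
state_share = received["state higher education councils"] / total_received

print(f"received {total_received} of {requested}: {compliance_rate:.0%}")
print(f"share supplied by state councils: {state_share:.0%}")
```

Under these assumptions the tally agrees with the reported 30% compliance rate and the 48% share attributed to state councils.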

The challenges of the official data component of this project were identifying appropriate
sources of the data, enlisting cooperation from database proprietors, tracking the status of 
requests, and working with variations in the data supplied by different sources. Governmental 
data sources were located with the assistance of the State Council of Higher Education for 
Virginia and SHEEO. Institutional sources were located by student assistants who searched
webpages and called institutions to obtain names and email addresses. Enlisting cooperation was 
most effective when the proper data source was located; when the request was clear and details 
were provided, as accomplished by the sample layout on the project website; and when
flexibility in deadlines and data content was offered. Tracking the status of requests required a 
separate database. This database contained an institution ID, name, position, email address and 
telephone number for each contact person, an identifier for the current contact, and notes 
describing the status of the request. Standardizing the data provided by different sources 
involved careful comparisons of definitions and calculations of proxy measures when the
requested information was not available. Although these procedures improved our ability to
obtain the official data, they were only marginally effective.
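
A request-tracking database with the fields listed above might be laid out as in the following sketch. The two-table design, the table and column names, and the sample rows are assumptions for illustration; the paper does not describe the actual schema or software used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per contact person at a study institution.
    CREATE TABLE contact (
        contact_id     INTEGER PRIMARY KEY,
        institution_id TEXT NOT NULL,
        name           TEXT NOT NULL,
        position       TEXT,
        email          TEXT,
        telephone      TEXT
    );
    -- One row per institution: who is currently being contacted, and status notes.
    CREATE TABLE request (
        institution_id     TEXT PRIMARY KEY,
        current_contact_id INTEGER REFERENCES contact(contact_id),
        status_notes       TEXT
    );
""")

# Hypothetical example rows.
conn.executemany(
    "INSERT INTO contact VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "U001", "A. Example", "Institutional researcher", "a@example.edu", "555-0100"),
        (2, "U001", "B. Example", "Department chair", "b@example.edu", "555-0101"),
    ],
)
conn.execute(
    "INSERT INTO request VALUES (?, ?, ?)",
    ("U001", 1, "Data promised; follow up next month"),
)

# List each institution with its current contact and request status.
rows = conn.execute("""
    SELECT r.institution_id, c.name, r.status_notes
    FROM request AS r
    JOIN contact AS c ON c.contact_id = r.current_contact_id
""").fetchall()
print(rows)
```

Keeping the current-contact identifier separate from the list of contacts lets a request move from one source to another, for example from a chairperson to an institutional researcher, without losing the earlier contact details.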

As with the other components of this project, the results from analyses of the official data
offered some obvious benefits for participating institutions and for higher education. The
benefits for institutions supplying the official data included experience with tracking the outflow
of students from degree programs; a snapshot of conditions in one of their largest, most
in-demand majors; and access to statistics measuring gender composition and migration patterns at
other institutions. An example of the data products can be seen at the website presenting
descriptive results from the statewide project at http://faculty.virginia.edu/attrition-cs-bio/, where
statistics for computer science and biology departments in Virginia are shown. For higher
education in general, the rewards included a contribution to understanding of the process by
which men and women are segregated into particular disciplines, insight into the general
relationship between departmental characteristics and undergraduate retention in particular
programs, and a measure of gendered attrition rates that can be used to compare program
outcomes across institutions.

Discussion 

This case study described the methodology employed for a large multi-institutional research 
project. The goal of the project was to identify departmental factors that affect equality of male 
and female undergraduate retention in computing majors. Quantitative and qualitative data were 
collected from students, faculty, and chairpersons in the largest and/or most prestigious computer
science programs in the contiguous United States. In addition, official enrollment and disposition
data were gathered for calculating measures of student migration from participating departments 
during a six-year period. 

The project methodology presented different degrees of challenge, and had varying levels 
of success as shown in Figure 2. Data collection by interview and survey was demanding but 
profitable. In comparison, collection of data from official sources was more difficult and met 
with less success. 

The data collection methods for the qualitative portion of this project followed 
established procedures, and were productive. Data were collected directly from individual 
participants at study departments through interviews and focus groups. The departmental 
participation rate for this component of the project was high (82%).

Most chairpersons were open to hosting site visits, graciously making physical 
arrangements and encouraging members of their departments to participate. Although the level 
of individual participation varied across departments, the overall number of department members
interviewed was high (325). As a result, the interviews and focus groups covered a large number 
of individuals at a variety of institutions. The data they generated made a valuable contribution to 
the quality of the subsequent survey, resulted in reports that departments could use for
self-evaluation and comparison with other computer science programs, and produced a wealth of
observations that could stimulate new hypotheses and illustrate statistical results. Because the
data were obtained from many institutions of various types and locations, they are likely to
represent the range of conditions present in large academic computer science departments and
depict the features common to most of these departments.

The data collection methods for the survey portion of this project also followed 
established procedures, and they were also very productive. Data were collected directly from 
individual participants at study departments through a questionnaire that employed the Internet,
mail, or telephone. The response rate for this component of the project was high (67%).

The survey data measured the distribution of characteristics and practices in computer
science by department, by sex, and by a variety of other institutional, program, and individual
characteristics. The example of factors affecting faculty turnover showed how multi-institutional
research can offer significant information relevant to institutional policy. Because many faculty
supplied the data for this result, it can be legitimately claimed that this finding represents the
situation for the average faculty member in a large computer science department within a two
percent margin of error. Thus, the finding applies broadly and cannot be dismissed lightly as
unique to a particular institution. Furthermore, when analyses by department are conducted, it
will be possible to discover which departments deviate from the norm and to investigate the
reasons for those deviations. Such comparisons across departments will offer abundant 
information for self-assessment and for deepening understanding of factors relevant to collegiate 
preparation for computing careers. 
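To make the margin-of-error claim concrete, the sketch below applies the standard formula for a proportion from a simple random sample, z * sqrt(p(1 - p) / n). The paper does not state which formula or sample size underlies its two percent figure, so the 95 percent confidence level (z = 1.96), the worst-case proportion p = 0.5, and the function names are all assumptions made for illustration. Because respondents are clustered within departments, the true margin for this survey may be somewhat wider than this simple formula suggests.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from
    n respondents, assuming simple random sampling (an assumption here)."""
    return z * math.sqrt(p * (1.0 - p) / n)


def respondents_needed(moe, p=0.5, z=1.96):
    """Approximate number of respondents needed to reach a target margin."""
    return math.ceil(z ** 2 * p * (1.0 - p) / moe ** 2)


if __name__ == "__main__":
    # Illustrative sample sizes only; none of these are figures from the study.
    for n in (100, 500, 1000, 2500):
        print(f"n = {n:5d}  margin of error = {margin_of_error(n):.3f}")
    print("respondents for a 2% margin:", respondents_needed(0.02))
```

Under these assumptions, a two percent margin requires on the order of a few thousand respondents, which is the scale that only a multi-institution sample is likely to supply.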

The rates for the survey’s three response modes demonstrate that this study’s population 
was quite appropriate for an online questionnaire. A large majority of participants responded 
online (90% of those responding used this method), resulting in a tremendous cost saving for the
project. The boon of reduced cost, together with easy communication and access to public
information, illustrates how technological advances have made multi-institution research more
feasible. However, one should not interpret these survey methodology results as having 
implications for response rates in general. The three response modes for this survey were not 
offered simultaneously, so comparing response rates is not legitimate. The web survey was
available first and for the entire study period, but mail and telephone were only offered after 
responses via the web declined. Thus while the Internet provided many benefits, raising response 
rates was not necessarily one of them. 

The data collection methods for the official data portion of this project had no well- 
established procedures to follow, and were the least productive of the project methodologies. 
Official data for calculating departments’ gender composition and gendered attrition rates were
collected from government sources or directly from study institutions. This process had the
lowest response rate of the project (30% participation rate). 

Following a similar process to that used for implementing a survey, potential data sources 
were located and contacted. Request status was tracked and non-response was followed with 
additional contact attempts as well as contact with alternative sources. Among those who
responded but declined to participate, the chief reason was reluctance to commit scarce resources 
for extracting data. Among those who did participate, statewide data repositories were not the 
most cooperative, but they were the most productive (48% of the data sets came from this
source), making them a highly efficient data source for multi-institution researchers. The next
most productive, and the most cooperative, source was individual institutional research offices
(44% of the data sets provided).

The current low response rate for official data jeopardizes the project’s ability to achieve
its primary goal of measuring the links between departmental factors and their gendered attrition
outcomes. If only 30 percent of the study departments are represented, there is serious potential
for bias. The gendered attrition rates of participating departments might not accurately represent
rates for the typical large computer science department. Further analyses will determine how
representative the study data actually are, but increasing the number of participants in this aspect 
of the project would reduce the margin of error for the statistical findings based on the official 
data. 

This case study illustrated several important issues regarding the current condition of
multi-institution research. Examining the process and outcomes of this project demonstrated the 
benefits, challenges, and persisting impediments to research involving many colleges and 
universities. 

The contributions of this project confirmed the worth of multi-institution research already 
demonstrated by other research of this type. (For examples of other research, see Astin and 
Astin, 1992; and Strenta, 1994.) In the case of this study, findings can help to meet the demand 
for computer professionals, promote female participation in a rewarding career field, and deepen 
our understanding of gender segregation processes in higher education. The value of these 
findings is enhanced when they can be generalized to all large undergraduate computing
programs, a feature that requires inclusion of multiple and diverse institutions.

This case study also substantiated methodological advances in data collection from
multiple institutions. Academic computer science is particularly suited to study via cost-saving
web surveys. The use of email facilitated communication between researchers and potential
participants. Web access to academic department sites, contact information for institutional
research staff, and government data sets made it possible to gather certain data without response
burden. In several cases, the existence and cooperation of statewide sources drastically
reduced the effort to acquire data.

Despite these benefits and advances, this case study also demonstrated that data 
availability still impedes the conduct of multi-institution studies. Restricted access, cumbersome 
data storage, and inadequate resources have yet to be overcome. 

Finally, this case study suggested that the most efficient means of disseminating data for
multi-institution research is through government sources. These sources can provide access to
high quality data from large numbers of institutions, minimize the data collection efforts of
higher education researchers, and relieve institutional researchers of the burden imposed by
extracting data for idiosyncratic requests. 

A national commitment to quality education requires a companion commitment to
collect, maintain, and disseminate the data that facilitate research and inform decision-making in
higher education. Government support can provide the necessary resources. Without this support,
multi-institution research will fail to thrive, and we will forfeit the educational and social
progress it generates. 







References 

Astin, Alexander W. and Helen S. Astin. 1992. Undergraduate Science Education: The Impact of 
Different College Environments on the Educational Pipeline in the Sciences. Higher 
Education Research Institute. Graduate School of Education. University of California, 
Los Angeles. 

Borden, Victor M. H. 1995. Harnessing New Technologies for Student Tracking. New
Directions for Institutional Research. Number 87. 

Creswell, John W., Jay L. Chronister, and Martha L. Brown. 1991. The Characteristics and
Utility of National Faculty Surveys. New Directions for Institutional Research. Number
69.

Dillman, Don A. 2000. Mail and Internet Surveys: The Tailored Design Method. New York: J.
Wiley.

Ewell, Peter T. 1995. Editor’s Notes. New Directions for Institutional Research. Number 87.

Firnberg, James W. 1991. Access and Use of Federal Data Through NSF’s CASPAR System.
New Directions for Institutional Research. Number 69.

Hackett, E. Raymond. 1996. Creating a Cost-Effective Data Exchange. New Directions for 
Institutional Research. Number 89. 

Lenth, Charles S. 1991. Editor’s Notes. New Directions for Institutional Research. Number 69.

Pavel, D. Michael and Mark Reiser. 1991. Using National Data Bases to Examine Minority
Student Success in Higher Education. New Directions for Institutional Research.
Number 69.

Russell, Alene Bycer. 1999. The Status of Statewide Student Transition Data Systems: A Survey 
of SHEEO Agencies. Report issued by the State Higher Education Executive Officers 
Association. May 1999. 

Russell, Alene Bycer, and Mark P. Chisholm. 1995. Tracking in Multi-Institutional Contexts.
New Directions for Institutional Research. Number 87.

Sapp, Mary M. 1996. Benefits and Potential Problems Associated with Effective Data-Sharing
Consortia. New Directions for Institutional Research. Number 89.

Strenta, A. Christopher, Rogers Elliott, Russell Adair, Michael Matier, and Jannah Scott. 1994. 
Choosing and Leaving Science in Highly Selective Institutions. Research in Higher 
Education. Vol. 35, No. 5.

Trainer, James F. 1996. Coming Aboard: Making the Decision to Join a Data-Sharing 
Organization. New Directions for Institutional Research. Number 89.

Vinsonhaler, Jeane C. and John F. Vinsonhaler. 1991. Developing a Data Base for Monitoring 
and Improving Quality. New Directions for Institutional Research. Number 69.

Young, Denise York and Lawrence J. Redlinger. 2001. Modeling Student Flows through The
University’s Pipelines. Presented at 41st Forum of the Association for Institutional
Research. Long Beach, California. June 5, 2001.




Author Notes 

This material is based upon work supported by the National Science Foundation under 
grant number EIA0089959. Any opinions, findings, and conclusions or recommendations 
expressed in this material are those of the author and do not necessarily reflect the views of the 
National Science Foundation. 







Figure 1. Sample Result from Survey Data.

Figure 2. Summary of Response to Study Methodologies.